Online Control With Least-Squares Methods
Abstract
Least-squares policy evaluation techniques (such as LSTD and iLSTD) have been shown to estimate the value of a policy with far less data than traditional TD techniques. Unfortunately, they rely on policy-dependent statistics that have to be discarded when the policy changes, which makes the techniques difficult to use for online control problems. In this paper, we explore the effect of policy on the least-squares statistics, distinguishing three fundamental effects. We then introduce the framework of least-squares Sarsa (LSS and iLSS) and empirically evaluate previously suggested approaches for handling data from older policies in the least-squares statistics. We show that these approaches can maintain the least-squares data efficiency in some control problems, and we identify circumstances where least-squares approaches can be problematic and where special handling of data from older policies improves learning.
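To make the "policy-dependent statistics" concrete, the following is a minimal sketch of LSTD-style statistics kept over state-action features, in the spirit of a least-squares Sarsa learner. It is an illustration under stated assumptions, not the paper's algorithm: the class name, the `decay` and `ridge` parameters, and the use of exponential down-weighting as one way to treat data from older policies are all hypothetical choices made for this example.

```python
class LeastSquaresSarsaStats:
    """Running LSTD-style statistics A and b over state-action features.

    Hypothetical sketch: names and parameters are illustrative only.
    """

    def __init__(self, n_features, gamma=0.9, decay=1.0, ridge=1e-6):
        self.n = n_features
        self.gamma = gamma
        # decay = 1.0 keeps all old data; decay < 1.0 down-weights samples
        # gathered under older policies (an assumed handling strategy).
        self.decay = decay
        self.ridge = ridge  # small diagonal term so the solve stays well-posed
        self.A = [[0.0] * n_features for _ in range(n_features)]
        self.b = [0.0] * n_features

    def update(self, phi, reward, phi_next):
        """Add one transition: phi = features of (s, a), phi_next of (s', a')."""
        n = self.n
        for i in range(n):
            self.b[i] = self.decay * self.b[i] + phi[i] * reward
            for j in range(n):
                # A accumulates phi * (phi - gamma * phi_next)^T.
                self.A[i][j] = (self.decay * self.A[i][j]
                                + phi[i] * (phi[j] - self.gamma * phi_next[j]))

    def solve(self):
        """Solve (A + ridge * I) theta = b by Gaussian elimination."""
        n = self.n
        m = [[self.A[i][j] + (self.ridge if i == j else 0.0) for j in range(n)]
             + [self.b[i]] for i in range(n)]
        for col in range(n):
            # Partial pivoting: move the largest remaining entry into place.
            pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
            if abs(m[pivot][col]) < 1e-12:
                raise ValueError("statistics matrix is singular")
            m[col], m[pivot] = m[pivot], m[col]
            for r in range(col + 1, n):
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
        theta = [0.0] * n
        for i in reversed(range(n)):
            rest = sum(m[i][j] * theta[j] for j in range(i + 1, n))
            theta[i] = (m[i][n] - rest) / m[i][i]
        return theta


# Usage: two one-hot state-action pairs; the second loops on itself with reward 1.
stats = LeastSquaresSarsaStats(n_features=2, gamma=0.5)
stats.update([1.0, 0.0], 0.0, [0.0, 1.0])
stats.update([0.0, 1.0], 1.0, [0.0, 1.0])
theta = stats.solve()
```

Because `phi_next` depends on the action the current policy selects in the next state, the accumulated `A` reflects whichever policies generated the data. That is why these statistics cannot simply be reused unchanged after the policy improves.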
Similar resources
Combined Estimation and Optimal Control of Batch Membrane Processes
In this paper, we deal with the model-based time-optimal operation of a batch diafiltration process in the presence of membrane fouling. Membrane fouling poses one of the major problems in the field of membrane processes. We model the fouling behavior and estimate its parameters using various methods. Least-squares, least-squares with a moving horizon, recursive least-squares methods and the ex...
Online Controller Tuning via FRIT and Recursive Least-Squares
This paper proposes an online type of controller parameter tuning method by modifying the standard fictitious reference iterative tuning method and by utilizing the so-called recursive least-squares (RLS) algorithm, which can cope with variation of plant characteristics adaptively. As used in many applications, the RLS algorithm with a forgetting factor is also applied to give more weight to mo...
Online Tuning Strategy for Multi-loop SISO PI Control Algorithms in Multivariable Interactive Systems
Tuning of PI control algorithms for coupled multi-input multi-output (MIMO) systems is a challenging problem. This paper extends a previously developed model-based adaptive tuning method to handle the tuning problem of coupled multivariable systems. The performance of the proposed method is compared to those of existing methods such as Biggest Log-modulus Tuning (BLT), Sequential Loop Closing (SLC) an...
Simultaneous Model Predictive Control and Identification: Closed-Loop Properties
Model Predictive Control and Identification is an adaptive control technique that solves an online optimization problem to find process inputs for the dual control problem. Its main goal is to bring robustness to Model Predictive Control and to increase its capability to handle uncertainties and time-varying parameters in the processes. Theoretical properties, such as feasibility of the optimization pr...
Least-squares methods for policy iteration
Approximate reinforcement learning deals with the essential problem of applying reinforcement learning in large and continuous state-action spaces, by using function approximators to represent the solution. This chapter reviews least-squares methods for policy iteration, an important class of algorithms for approximate reinforcement learning. We discuss three techniques for solving the core, po...